Genetic diversity among maize (Zea mays L.) inbred lines adapted to Japanese climates

Understanding the genetic diversity of inbred lines is vital for the development of superior F1 varieties. The present study aimed to analyze Japanese maize parental inbred lines and determine their genetic diversity for future breeding. Genetic analyses were conducted using multiple methods. Principal component analysis (PCA), phylogenetic trees, and Bayesian clustering reflected borders between heterotic groups according to the derivation of each inbred line. A self-pollinated line derived from a classic F1 variety and another line from an open-pollinated population of the same derivation were classified as separate components by PCA and Bayesian clustering. This result suggests that open pollination could be essential in modern breeding. Of the lines classified as dent or flint based on their derivation, some had a combination of all components or clusters. Therefore, the classification of inbred lines should be based on both their derivation and DNA markers. The findings will be valuable for breeding and genetic studies in Japan. Additionally, these techniques may be used to obtain a larger number of SNPs and related phenotypic data.


Introduction
Concerns about the long-term food supply in Japan have led the Japanese government to promote a policy of raising food production for self-sufficiency [1]. Japanese public sectors contribute to these efforts by breeding high-yield maize (Zea mays L.) varieties that are well adapted to Japanese climates.
While maize is grown as a major crop in several countries, Japan has relied almost entirely on maize imports. The current policies regarding self-sufficiency in food supply have gradually increased the area under maize cultivation in Japan. Developing new maize F1 varieties better suited to Japanese climates will increase maize production.
Maize grown for silage and maize grown for grain differ primarily in their ripening times. While the former is harvested at the yellow ripening stage about 40 days post-silking in the Kanto region of Japan, the latter is harvested on attaining full maturity after an additional three to four weeks. Therefore, the variety's relative maturity (RM) needs to be shorter to fit the cropping system.
Understanding the genetic diversity and population structure of inbred lines is essential to the development of superior F1 varieties [2][3][4] and will enable the classification of inbred lines into heterotic groups, the selection of efficient mates or testers in F1 development, and the introgression of superior genes from diverse genetic resources [4][5][6][7]. Japan spans a long distance from north to south, and in the northernmost region, Hokkaido, earlier-ripening maize varieties and inbred lines adapted to cold climates are grown. The genetic basis of these varieties has been previously reported [8] and can potentially be exploited as an essential tool for breeding early-maturing varieties in warmer regions. Additionally, reports on the genetic diversity of inbred lines grown south of the Kanto region are available [9,10]. However, these varieties have not yet been comprehensively analyzed. The present study aimed to extensively study Japanese parental maize inbred lines and to analyze their genetic diversity.

Plant materials
Table 1 lists details of the 127 inbred lines used in this study. All of these were developed in public sector research stations, namely, the Hokkaido, Miyakonojo, and Nasushiobara stations of NARO (The National Agriculture and Food Research Organization, Japan), or the Prefectural public breeding sections in Nagano, Japan. The names of these inbred lines start with the first letters of the related locations: "Ho" (Hokkaido), "Mi" (Miyakonojo), "Na" (Nasushiobara), and "CHU" (Chushin area), respectively. "Ki" is used in the names of breeding lines in Nagano Pref. and is taken from the first letters of the Kikyogahara area in Shiojiri, Nagano Pref., but it is also the name of an inbred line at Kasetsart University in Thailand [11]. To avoid confusion, we used conservation names starting with "J" and "JC" instead of "Ki". All the lines had clear derivations and heterotic groups based on the breeding history documented by the developing sectors. The inbred lines were classified into several groups based on their genetic backgrounds as described by Enoki et al. [8], partially modified following Tamaki et al. [9]. These included flint mainly developed or derived from the European region (EF), tropical inbred lines mainly developed from hybrids for summer seeding (RD), flint mainly derived from Japanese landraces (JF), dent mainly derived from US corn-belt dent (MD), and miscellaneous origin (MIS).
DNA preparation. DNA was extracted as previously described by Tamaki et al. [10]. Briefly, a fresh leaf section weighing about 1 g from each seedling growing in a greenhouse was cut using scissors, frozen with liquid nitrogen, and milled using a 'Multi-beads shocker®' (Yasui Kikai Corporation, Osaka, Japan) thrice for 10 seconds at 1500 rpm under frozen conditions. DNA was extracted using the 'DNeasy Plant Mini Kit™' (Qiagen, Venlo, Netherlands) per the manufacturer's instructions using 100 μl of the supernatant collected post-milling. DNA concentration was measured using a 'Qubit™ 2.0 Fluorometer' and 'dsDNA HS assay kit' (Thermo Fisher Scientific, Massachusetts, USA). DNA concentrations were adjusted to 10 ng/μl for sequencing reactions.
Genotyping. All inbred lines were genotyped using the 'Maize LD Bead chip' (Illumina Inc, San Diego, USA) containing 3,047 single-nucleotide polymorphisms (SNPs) and analyzed using the software 'GenomeStudio 2.0'. While the software allows operators to manually adjust the genotype call at each SNP locus, the authors opted to follow the automatic calls made by the software. After analysis, a custom report in Plink format was generated using Report Wizard.
Markers with more than 5% missing data or less than 3% minor allele frequency were removed using 'Plink' [12] version 1.90. Highly correlated SNPs were removed by linkage disequilibrium pruning with a variance inflation factor threshold of 2, resulting in 1,007 SNPs that were analyzed further.
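As an illustration of the two per-marker filters described above, the following Python sketch applies a 5% missing-data cutoff and a 3% minor allele frequency cutoff to a small genotype table. It is not the Plink implementation used in the study: the function names, the 0/1/2 allele-count encoding with None for missing calls, and the toy markers are all assumptions made for this example, and the linkage disequilibrium pruning step is not reproduced.

```python
from typing import Dict, List, Optional

# Assumed encoding: 0, 1, or 2 copies of one allele per line; None = missing call.
Genotype = Optional[int]


def missing_rate(calls: List[Genotype]) -> float:
    """Fraction of lines with no genotype call at one marker."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """Minor allele frequency computed from the non-missing calls only."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    allele_freq = sum(observed) / (2 * len(observed))
    return min(allele_freq, 1.0 - allele_freq)


def passes_qc(calls: List[Genotype],
              max_missing: float = 0.05,
              min_maf: float = 0.03) -> bool:
    """Keep a marker with at most 5% missing calls and at least 3% MAF."""
    return (missing_rate(calls) <= max_missing
            and minor_allele_frequency(calls) >= min_maf)


def filter_markers(markers: Dict[str, List[Genotype]]) -> Dict[str, List[Genotype]]:
    """Return only the markers that pass both filters."""
    return {name: calls for name, calls in markers.items() if passes_qc(calls)}


# Hypothetical toy data: three markers scored on ten lines.
toy_markers = {
    "snp_a": [0, 2, 2, 0, 0, 2, 0, 2, 0, 0],        # polymorphic, complete
    "snp_b": [0] * 10,                              # monomorphic
    "snp_c": [0, 2, None, None, 0, 2, 0, 2, 0, 0],  # 20% missing calls
}
kept = filter_markers(toy_markers)
print(sorted(kept))
```

In this toy input only snp_a is retained, because snp_b is monomorphic and snp_c exceeds the missing-data cutoff.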
Statistics and population structure analysis. Statistical analyses were performed to investigate the genetic distinctness of the inbred lines based on the 1,007 SNP marker profiles. Population structure was estimated using principal component analysis (PCA), Bayesian clustering, and maximum likelihood (ML) phylogenetic analysis. PCA was performed using Plink. Bayesian clustering was conducted using ADMIXTURE [13]. We assumed K = 2-10, and the optimal K value was estimated based on cross-validation error (CVE) values calculated per the ADMIXTURE manual. An ML phylogenetic tree was constructed using MEGA X ver.10.18 [14] according to the Tamura-Nei (1993) [15] model with 1000 bootstrap replicates. The tree was drawn using the unweighted pair group method with arithmetic mean (UPGMA). The pairwise p-distance, i.e., the proportion (p) of nucleotide sites at which two lines differ, was calculated by MEGA X and used as the genetic distance (GD).
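To make the GD measure concrete, the sketch below computes a p-distance between two lines as the number of differing sites divided by the number of sites compared. It is an illustration only and not the MEGA X implementation: it assumes one base per site for each (homozygous) inbred line, and it skips any site missing in either line (pairwise deletion), which is an assumption because the missing-data setting used in the study is not stated. The function names and toy sequences are hypothetical.

```python
from itertools import combinations
from typing import Dict, Optional, Sequence, Tuple


def p_distance(a: Sequence[Optional[str]], b: Sequence[Optional[str]]) -> float:
    """Proportion of compared sites at which two lines carry different bases."""
    if len(a) != len(b):
        raise ValueError("both lines must be scored at the same sites")
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # assumed pairwise deletion of missing sites
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no sites were scored in both lines")
    return differing / compared


def pairwise_distances(
    lines: Dict[str, Sequence[Optional[str]]]
) -> Dict[Tuple[str, str], float]:
    """p-distance for every unordered pair of lines, keyed by sorted names."""
    return {
        (m, n): p_distance(lines[m], lines[n])
        for m, n in combinations(sorted(lines), 2)
    }


# Hypothetical toy genotypes at ten SNP sites.
toy_lines = {
    "line_a": "ACGTACGTAC",
    "line_b": "ACGTACGTAA",
    "line_c": "TCGTACGAAA",
}
for pair, gd in pairwise_distances(toy_lines).items():
    print(pair, round(gd, 3))
```

For 127 lines this yields 127 × 126 / 2 = 8,001 pairwise values, which is the size of the distribution summarized in the Results.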

Results
The frequency distribution of pairwise GDs for the 127 maize inbred lines genotyped at 1,007 SNPs is shown in Fig 1. Table 2 lists the maximum, minimum, and mean GD within and among groups. S1 Table lists the full GD matrix of the 127 inbred lines. The pairwise GDs between inbred lines varied from 0.004 to 0.421, and the overall mean distance was 0.332. Nearly half (48.9%) of the GDs fell between 0.350 and 0.400, with the lowest GD (0.004) being observed between 'Ho120' and 'Ho128', both of which were in the EF group but of different derivations. The highest GD of 0.421 was observed between 'Na94' and 'Ho124', which were classified in the MIS (MD*JF) and EF heterotic groups, respectively, and had different derivations. The mean GDs between different heterotic groups tended to have low values, and minimum GDs within the same groups were relatively low, except for the RD series, which contained fewer lines. However, certain inbred lines had high GDs even within the same groups.
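A within-group and between-group summary of the kind reported in Table 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to build the table: the function name, the line names, the group labels, and the distance values are all invented for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def summarize_by_group(
    distances: Dict[Tuple[str, str], float],
    group_of: Dict[str, str],
) -> Dict[Tuple[str, str], Dict[str, float]]:
    """Minimum, maximum, mean, and count of GDs for each pair of groups.

    A key such as ("G1", "G1") holds within-group comparisons and a key
    such as ("G1", "G2") holds between-group comparisons.
    """
    buckets: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for (line_a, line_b), gd in distances.items():
        g1, g2 = sorted((group_of[line_a], group_of[line_b]))
        buckets[(g1, g2)].append(gd)
    return {
        key: {"min": min(v), "max": max(v), "mean": mean(v), "n": len(v)}
        for key, v in buckets.items()
    }


# Hypothetical lines, groups, and distances (not values from this study).
toy_groups = {"L1": "G1", "L2": "G1", "L3": "G2", "L4": "G2"}
toy_distances = {
    ("L1", "L2"): 0.10,
    ("L1", "L3"): 0.40,
    ("L1", "L4"): 0.30,
    ("L2", "L3"): 0.35,
    ("L2", "L4"): 0.45,
    ("L3", "L4"): 0.20,
}
for key, stats in sorted(summarize_by_group(toy_distances, toy_groups).items()):
    print(key, stats)
```

Keying each comparison by the sorted pair of group labels keeps within-group and between-group comparisons in one table, so a group with few lines is easy to spot from its small n.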
PCA separated the EF, RD, JF, and MD groups well (Fig 2). Inbred lines belonging to MIS, including 'Na57', 'J1707', and 'Na72', were placed near the midpoint between the JF and MD populations. Notably, 'Na94', an MIS line derived from MD and JF, and 'Na112', an MIS line derived from EF and JF, were both placed in the JF population.
The results of the population structure analysis were confirmed using a phylogenetic tree, in which the 127 genotyped inbred lines formed several groups, with each group further divided into subgroups (Fig 3). The groups agreed with the classification previously ascertained from each line's derivation. 'Ho95', attributed to the JF mass population, was placed at the edge of the JF group.
In Bayesian clustering by ADMIXTURE (Fig 4), the optimal K value was estimated to be 3 (CVE = 0.970). The second cluster was predominant in the EF group. The JF group was dominated by the first cluster and the MD group by the third, with the proportion of each cluster fluctuating as the JF and MD groups approached their borders. While all clusters were approximately equal in the RD group, which thus differed from all other heterotic groups, the cluster proportions were very similar to those of some MD inbred lines. Some of the five MIS lines and 'Ho95' contained some clusters outside the classified group. However, several inbred lines shared similar cluster compositions regardless of their heterotic groups.
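The choice of K by cross-validation error can be sketched in a few lines of Python. The example assumes that each ADMIXTURE run with cross-validation enabled writes a log line of the form "CV error (K=3): 0.55000"; that line format, the function names, and the numeric values below are assumptions made for illustration and are not the study's results. Ties are broken in favor of the smaller K, which is a design choice of this sketch.

```python
import re
from typing import Dict, Iterable

# Assumed log line format, e.g. "CV error (K=3): 0.55000".
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")


def parse_cv_errors(lines: Iterable[str]) -> Dict[int, float]:
    """Collect the cross-validation error reported for each K."""
    errors: Dict[int, float] = {}
    for line in lines:
        match = CV_LINE.search(line)
        if match:
            errors[int(match.group(1))] = float(match.group(2))
    return errors


def best_k(errors: Dict[int, float]) -> int:
    """Return the K with the lowest error, preferring the smaller K on ties."""
    if not errors:
        raise ValueError("no cross-validation errors were found")
    return min(errors, key=lambda k: (errors[k], k))


# Invented log lines for illustration only.
toy_log = [
    "Summary statistics follow",
    "CV error (K=2): 0.61000",
    "CV error (K=3): 0.55000",
    "CV error (K=4): 0.58000",
]
cv_errors = parse_cv_errors(toy_log)
print(cv_errors, best_k(cv_errors))
```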

Discussion
Genetic diversity is crucial in selecting genotypes to initiate new breeding programs. New inbred lines from one group are expected to perform well in combination with inbred lines from other groups. F1 hybrids result from crosses between inbred lines of different heterotic groups, the development of which is facilitated by establishing genetic similarities between inbred lines [4,5,7]. Different heterotic groups may also arise from inbred lines with a common derivation [16,17]. Genotyping is one of the most reliable approaches to documenting polymorphisms in selected inbred lines [16].
The genetic diversity of Japanese maize inbred lines was analyzed using multiple methods. PCA, phylogenetic trees, and Bayesian clustering reflected the heterotic group borders according to the derivation of each inbred line, with marginal variations due to differences in each method.
The lowest GD value was 0.004, between 'Ho120' and 'Ho128' (both EF), derived from the triploid cross and mass selection. The GD between 'Na89' and 'JC-009' (both JF), which were derived from the same population but selected in different regions, was 0.07 (Table 1). Although the breeding years of the two EF lines were close and the same breeders selected them, the EF lines of different origins had greater genetic similarity than the JF lines of the same origin. Warburton et al. [18] have reported that DNA markers may be better indicators for inbred lines in cases where lines derived from the same population are more distinct than lines derived from different populations. The findings of our study concur with this insight.
Although the average GD between different heterotic groups tended to be low, certain inbred lines had high GD even within the same group. For instance, within the MD group, 'J1383' and 'J1706' had the highest GD (0.392) and differed in their PCA components and Bayesian clusters. Based on our empirical findings, F1 progenies between dent lines tend to yield more grain than those from dent and flint combinations, which may be an essential insight for future use of combinations within the same group. Crosses should be made between inbred lines of different populations, or of the same population with high GD, to ensure the development of productive F1 hybrids.
PCA results indicated a clear genetic distinction between the EF, RD, JF, and MD groups, except in cases where certain JF and MD inbred lines were close. While 'J1539', classified in the MD group, was located close to the border with JF, 'J1383', with the same origin, was placed within the MD group. Notably, the former is derived from an open-pollinated population from a classic F1 variety, and the latter from self-pollination of the same variety.
Clustering analysis is the process of inferring the ancestry of inbred lines from genotype information [19]. The SNPs analyzed in this study revealed the existence of three subpopulations (K = 3) among the 127 inbred lines. Inbred lines with similar derivation tended to cluster within the same group. Thus, the SNPs classified inbred lines into heterotic groups based on derivation and similar genetic backgrounds. Interestingly, 'J1383' and 'J1539', which have the same origin but a GD value of 0.303, were classified as separate components and clusters by PCA and clustering. Thus, the findings suggest that open pollination, a classic and effective tool for inducing genetic modification, could be essential in modern breeding.
'Ho95' belongs to the JF group; it is based on a Caribbean-type flint breeding population and ripens relatively later than other inbred lines bred in Hokkaido. Our previous study [10] classified this line separately from other JF lines, which the phylogenetic tree and clustering analysis confirmed. The results of Bayesian clustering (K = 3) showed that 'Ho95' and the RD group had all clusters. Similarly, several other inbred lines had all clusters but would be classified as MD, JF, or EF only if our recorded derivation was considered (S2 Table). 'Na112' is an MIS line derived from both EF and JF according to its derivation, but it had little of the second cluster, which is predominant in EF, and a minimum GD of 0.358 from EF lines. Relying on these classifications will only be possible after examining them based on both the origin of the underlying population and DNA markers [20]. Since the origin of these inbred lines, especially before the 1990s, is based on old handwritten records, further details, including the possibility of human error, should be examined in the future.
The low cost of SNP arrays allows the analysis of numerous samples. However, given that they are developed using reference genomes, they can introduce bias in diversity studies. This has been exemplified by the observation of significant ascertainment bias by Ganal et al. [21] using the 'MaizeSNP50' array by Illumina Inc. The findings of the present study do not contradict previous findings on genetic diversity obtained by RAD-seq [10], nor the recorded derivations of the lines. This suggests that the SNP array analysis used in this study helps in understanding genetic diversity. The development of accurate, inexpensive, and reproducible genotyping platforms has been a primary driver of genotypic studies, including those on genomic prediction [22]. The SNP array 'Maize LD Bead chip' has already been discontinued, and alternative methods will be needed in the future. Other tools for genome-wide SNP analysis should also be considered for association studies. In the future, these techniques may be used to obtain more SNPs and related phenotypic data, which could provide further insight [2,3,5,23,24].
The results of this study will serve as a valuable resource not only for maize breeding in Japan but also for genetic studies, including association mapping or genomic prediction, where genetic divergence and extended LD patterns of inbred lines are required.

Fig 4. ADMIXTURE model clustering for K = 2-4. The percentage belonging to each cluster is indicated by the length of the color bar (y-axis). CVE: Cross-validation error. The order of the inbred lines is the same as in Fig 3. Red shape: Miscellaneous origin. https://doi.org/10.1371/journal.pone.0297549.g004

Table 1. (Continued)
EF, RD, JF, MD, and MIS indicate flint mainly developed or derived from the European region, Japanese tropical inbred lines mainly developed from hybrids for summer seeding, Japanese flint landraces, Japanese dent mainly derived from US corn-belt dent, and miscellaneous origin, respectively.
‡ HARC, KARC, ILGS, and CAES are abbreviations for Hokkaido Agricultural Research Center, NARO; Kyushu Okinawa Agricultural Research Center, NARO; Institute of Livestock and Grassland Science, NARO; and Chushin Agricultural Experiment Station, Nagano pref., respectively.
https://doi.org/10.1371/journal.pone.0297549.t001